LAUNCHPADS: A System for Processing Ad Hoc Data
Abstract
An Introduction to PADS. Ideally, any data we encounter would be presented to us in standardized formats, such as XML. Why? Because for formats like XML, there is a whole host of software libraries, query engines, visualization tools and even programming languages specially designed to help users process their data. However, we do not live in an ideal world, and in reality vast amounts of data are produced and communicated in ad hoc formats, that is, formats for which no data processing tools are readily available. Figure 1 presents a small selection of ad hoc data sources. As one can see, ad hoc data exists in a very wide variety of fields, and its users range from network administrators to computational biologists and genomics researchers to physicists, financial analysts and everyday programmers. Programmers often deal with this data by whipping up one-time Perl scripts or C programs to parse and analyze it. Unfortunately, this strategy is slow and tedious, and it often produces code that is difficult to understand, lacks adequate error checking, and breaks when the format changes over time.
To expedite and improve this process, we developed the PADS data description language and system [2, 3]. Using the PADS language, one may write a declarative description of the structure of almost any ad hoc data source. The descriptions take the form of types, drawn from a dependent type theory. For instance, PADS base types describe simple objects including strings, integers, floating-point numbers, dates, times, and IP addresses. Records and arrays specify sequences of elements in a data source, and unions, switched unions and enums specify alternatives. Any of these structured types may be parameterized, and users may also write arbitrary semantic constraints over their data. Once a programmer has written a description in the PADS language, the PADS compiler can generate a collection of format-specific libraries in C, including a parser, printer, and verifier.
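To make the idea of a declarative description with semantic constraints more concrete, the following is a minimal sketch in Python. It is not PADS syntax and not the code the PADS compiler generates; the names `Field`, `LOG_ENTRY`, and `parse`, and the simplified three-field record layout, are assumptions made purely for illustration. Each field pairs a base type (a pattern plus a conversion) with an optional constraint, and a single generic function interprets the description as both a parser and a verifier.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Field:
    """One element of a record: a base type plus an optional constraint."""
    name: str
    pattern: str                       # regex describing the raw text
    convert: Callable[[str], Any]      # raw text -> typed value
    constraint: Optional[Callable[[Any], bool]] = None


# Hypothetical description of a simplified log record: "<ip> <status> <size>".
LOG_ENTRY = [
    Field("client", r"\d{1,3}(?:\.\d{1,3}){3}", str,
          lambda ip: all(int(part) <= 255 for part in ip.split("."))),
    Field("status", r"\d{3}", int, lambda code: 100 <= code <= 599),
    # A union-like alternative: either a byte count or "-" for "unknown".
    Field("size", r"\d+|-", lambda text: None if text == "-" else int(text)),
]


def parse(description, line, sep=" "):
    """Interpret a description against one line.

    Returns (record, errors). Syntax errors stop parsing; constraint
    violations are recorded but parsing continues, so the caller gets
    both the data and a report of what was wrong with it.
    """
    record, errors = {}, []
    pos = 0
    for index, field in enumerate(description):
        if index > 0:
            if not line.startswith(sep, pos):
                errors.append(f"{field.name}: missing separator at offset {pos}")
                return record, errors
            pos += len(sep)
        match = re.compile(field.pattern).match(line, pos)
        if match is None:
            errors.append(f"{field.name}: syntax error at offset {pos}")
            return record, errors
        value = field.convert(match.group())
        if field.constraint is not None and not field.constraint(value):
            errors.append(f"{field.name}: constraint violated by {value!r}")
        record[field.name] = value
        pos = match.end()
    if pos != len(line):
        errors.append(f"trailing data at offset {pos}")
    return record, errors
```

Calling `parse(LOG_ENTRY, line)` on each line of a file yields the typed record together with a list of errors, so a caller can decide whether to accept, repair, or discard the entry. The sketch interprets the description at run time, whereas the abstract describes a compiler that generates format-specific C libraries from it.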
In addition, the compiler can compose these libraries with generic templates to create value-added tools such as an ad hoc-to-XML format conversion tool, a histogram generator, and a statistical analysis and error summary tool. Finally, PADS has been composed with the GALAX query engine [6, 4, 5] for XQuery to create PADX [1], a new system that allows users to query and transform any ad hoc data source as if it were XML, without incurring the performance penalty that usually results when one converts ad hoc data into a much more verbose XML representation.
While the PADS language provides an extremely versatile means of creating tools for processing ad hoc data, it is nevertheless a new language, and learning a new language is time-consuming for anyone, especially for computational biologists and other scientists for whom programming is not the primary area of expertise. To ease the way for novice PADS users, we developed LAUNCHPADS, a new tool that provides access to the PADS system without requiring prior knowledge of the PADS language itself. The LAUNCHPADS graphical interface will also help more experienced PADS users shorten their development cycle and provides a con...
Figure 1: A small selection of ad hoc data sources.
Name: Use | Representation
Web server logs (CLF): Measure web workloads | Fixed-column ASCII records
CoMon data: Monitor PlanetLab machines | ASCII records
Call detail: Fraud detection | Fixed-width binary records
AT&T billing data: Monitor billing process | Various Cobol data formats
Netflow: Monitor network performance | Data-dependent number of fixed-width binary records
Newick: Immune system response simulation | Fixed-width ASCII records in tree-shaped hierarchy
Gene Ontology: Gene-gene correlations | Variable-width ASCII records in DAG-shaped hierarchy
CPT codes: Medical diagnoses | Floating point numbers
Publication date: 2006